National Repository of Grey Literature: 16 records found
Music Style Recognizer from mp3
Duchoň, Luboš ; Szőke, Igor (referee) ; Grézl, František (advisor)
This bachelor's thesis gives a detailed description of the MP3 audio data format and of a music style recognizer. The recognizer is based on the HTK Hidden Markov Model Toolkit and on coefficients obtained directly from MP3 files.
Detection of selected audio events in a real environment
Kowolowski, Alexander ; Burget, Radim (referee) ; Přinosil, Jiří (advisor)
This work deals with methods for detecting dangerous events, in this case gunshots, in a real environment. First, a training and testing database of sounds was created from the MIVIA database. Each file in this database is available at six signal-to-noise ratios, so the selected methods were subsequently tested on the variously shuffled files; some methods proved more accurate than others on cleaner recordings but less accurate on noisier ones. Mel-frequency cepstral coefficients were used throughout for feature extraction from the input sound. Support vector machines and an ensemble of weak classifiers are tested in turn on the created databases. These methods are then further optimized, for example by using statistical variables, and after optimization they achieve better results, as expected. Two scripts were created: one builds the training database and trains the classifier on it, and the other builds the test database, tests the selected classifier and collects the results. The results are summarized in a confusion matrix, from which several ratio measures such as accuracy, sensitivity and specificity are calculated. They are presented in tables and column charts in the relevant chapters of the thesis, with commentary.
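The evaluation step described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the names `BinaryConfusion` and `confusion_from_labels`, and the convention that label 1 means "gunshot", are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryConfusion:
    """Counts of a two-class confusion matrix (hypothetical helper type)."""
    tp: int  # gunshot correctly detected
    fp: int  # background sound reported as gunshot
    tn: int  # background sound correctly rejected
    fn: int  # gunshot missed

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        # Fraction of actual positives that were detected (recall).
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of actual negatives that were correctly rejected.
        return self.tn / (self.tn + self.fp)


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int],
                          positive: int = 1) -> BinaryConfusion:
    """Build the confusion counts from reference and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return BinaryConfusion(tp, fp, tn, fn)


# Made-up labels for illustration; not data from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
cm = confusion_from_labels(y_true, y_pred)
print(cm, cm.accuracy(), cm.sensitivity(), cm.specificity())
```

Sensitivity and specificity are reported separately because a gunshot detector can reach high accuracy on mostly silent recordings while still missing most gunshots.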
DEVELOPMENT OF ALGORITHMS FOR GUNSHOT DETECTION
Hrabina, Martin ; Tučková, Jana (referee) ; Počta, Peter (referee) ; Sigmund, Milan (advisor)
This thesis deals with gunshot recognition and associated problems. First, the topic is introduced and divided into smaller steps. Next, an overview is given of sound databases, significant publications, events and the current state of the art, together with an overview of possible applications of gunshot detection. The second part compares features using various metrics and compares their recognition performance. A comparison of recognition algorithms follows, and new features usable for recognition are presented. The thesis culminates in the design of a two-stage gunshot recognition system that monitors its surroundings in real time. The conclusion summarizes the results achieved and outlines further work.
Acoustic Scene Classification from Speech
Grepl, Filip ; Beneš, Karel (referee) ; Matějka, Pavel (advisor)
This thesis deals with creating a system that recognizes the type of location where a recording was made by analyzing the audio signal. The classifier is a multi-layer, fully connected neural network whose topology is based on the baseline system provided for the DCASE competition. A dataset from this competition is also used for training and evaluating the neural network. The experiments focus in particular on the feature representation of the audio recordings and on the format of the input data of the neural network. For this purpose, Mel-filter bank, block Mel-filter bank and MFCC features are used. The experiments performed in this thesis increased classification accuracy by 6.5 % compared to the DCASE baseline system. The overall success rate of the system reached 67.5 %.
Automatic Recognition of Logopaedic Defect in Speech Utterances
Dušil, Lubomír ; Atassi, Hicham (referee) ; Smékal, Zdeněk (advisor)
The thesis is aimed at the analysis and automatic detection of logopaedic defects in speech utterances. Its objective is to facilitate and accelerate the work of logopaedists and to increase the percentage of logopaedic defects detected in children at the youngest possible age, followed by the most successful possible treatment. It presents methods of working with speech, a classification of the defects within the individual stages of child development, and words suitable for identifying speech defects and for their subsequent remedy. It then analyses methods of calculating the coefficients that best reflect human speech, and the classifiers used to determine whether an utterance contains a speech defect or not. The classifiers operate on these coefficients. Coefficients and classifiers are tested to find the combination that achieves the highest possible success rate of automatic speech defect detection. All programming and testing was carried out in Matlab.
Online detection of simple voice commands in audiosignal
Zezula, Miroslav ; Březina, Lukáš (referee) ; Krejsa, Jiří (advisor)
This thesis describes the development of a voice module that can recognize simple speech commands by comparing the input sound with recorded templates. The first part of the thesis describes the algorithm used and verifies its functionality. The algorithm is based on Mel-frequency cepstral coefficients and dynamic time warping. The hardware of the voice module is then designed around the 56F805 signal controller from Freescale. The signal from the microphone is conditioned by operational amplifiers and a digital filter. The third part deals with the development of software for the controller and describes the fixed-point implementation of the algorithm, which respects the limited capabilities of the controller. A final test demonstrates the usability of the voice module in a low-noise environment.
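The template-matching idea in this abstract can be sketched in a few lines of Python. This is an illustrative floating-point version of textbook dynamic time warping, not the fixed-point controller implementation the thesis describes. The function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the toy one-dimensional "features" standing in for MFCC vectors are all assumptions.

```python
import math
from typing import Dict, List, Sequence

Frame = Sequence[float]  # one feature vector per frame, e.g. MFCCs


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Cumulative cost of the best monotonic alignment of a and b."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost: List[List[float]] = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b[j-1] is reused
                                 cost[i][j - 1],      # b advances, a[i-1] is reused
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def recognize(frames: Sequence[Frame],
              templates: Dict[str, Sequence[Frame]]) -> str:
    """Return the name of the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(frames, templates[name]))


# Toy 1-D features for illustration only; real input would be MFCC frames.
templates = {
    "go": [[0.0], [1.0], [2.0]],
    "stop": [[5.0], [5.0], [4.0]],
}
spoken = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # "go" uttered more slowly
print(recognize(spoken, templates))
```

Because the warping path may repeat frames, a command spoken more slowly than its template can still align with zero cost, which is what makes the method suitable for comparing utterances of different lengths.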
Music Style Recognition
Behúň, Kamil ; Polok, Lukáš (referee) ; Hradiš, Michal (advisor)
This thesis deals with music style recognition. The introduction gives an overview of current methods used in music style recognition. The following chapters describe the system created for music style recognition. The final system combines two feature extraction methods: the first extracts Mel-frequency cepstral coefficients from the recordings, and the second extracts features from spectrograms of the recordings. The final system uses a Support Vector Machine for classification.
Automatic classification of pronunciation of the letter „R“
Hrušovský, Enrik ; Vičar, Tomáš (referee) ; Harabiš, Vratislav (advisor)
This diploma thesis deals with automatic classification of the sound R. The purpose of the thesis is to create a program for detecting defective pronunciation of the sound R in children. The thesis covers speech production, speech therapy and dyslalia, followed by speech signal processing and analysis methods. The last part presents the design of software for automatic detection of the pronunciation of the sound R. The MFCC algorithm is used to extract features for pronunciation recognition. These features are then classified by a neural network as correct or incorrect pronunciation, and the classification success rate is evaluated.